Site-specific incorporation of a fluorescent nucleobase analog enhances i-motif stability and allows monitoring of i-motif folding inside cells

Abstract The i-motif is an intriguing non-canonical DNA structure whose role in the cell is still controversial. Methods to study i-motif formation under physiological conditions in living cells are needed to investigate its potential biological functions. The cytosine analog 1,3-diaza-2-oxophenoxazine (tCO) is a fluorescent nucleobase able to form either hemiprotonated base pairs with cytosine residues or neutral base pairs with guanines. We show here that when tCO is incorporated in the proximity of a G:C:G:C minor groove tetrad, it induces a strong thermal and pH stabilization, resulting in i-motifs with a Tm of 39 °C at neutral pH. The structure determined by NMR methods reveals that the enhanced stability is due to a large stacking interaction between the guanines of the tetrad and the tCO nucleobase, which forms a tCO:C+ base pair in the folded structure at unusually high pH, leading to increased quenching of its fluorescence under neutral conditions. This quenching is much lower when tCO is base-paired to guanines and disappears completely when the oligonucleotide is unfolded. By taking advantage of this property, we have been able to monitor i-motif folding in cells.

Table S4. Average dihedral angles and order parameters of the neutral structure of NN4_tCO2.
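Average dihedral angles and their order parameters over an ensemble of structures are commonly computed with circular statistics, because angles wrap around at ±180°. The following is a minimal sketch of that standard definition, not the procedure stated by the authors; the function name `circular_mean_and_order` and the example angles are hypothetical.

```python
import math

def circular_mean_and_order(angles_deg):
    """Return (mean angle in degrees, order parameter S) for a list of dihedral angles.

    Each angle is treated as a unit vector; S is the length of the average
    vector (1 = all angles identical, close to 0 = angles spread uniformly).
    """
    n = len(angles_deg)
    if n == 0:
        raise ValueError("need at least one angle")
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    order = math.hypot(c, s)               # length of the mean unit vector
    mean = math.degrees(math.atan2(s, c))  # direction of the mean unit vector
    return mean, order

# Hypothetical values of one dihedral angle across an ensemble of structures
mean_angle, s_param = circular_mean_and_order([170.0, -170.0])
print(mean_angle, s_param)
```

A plain arithmetic mean of 170° and -170° would give 0°, which is why the vector average is used in this sketch.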

Figure S7. Scheme of the most relevant NOE contacts found for NN4_tCO2 at pH 7. Round dotted lines indicate G:C base pairs and square dotted lines indicate C:C+ base pairs. The C1:G16 base pair can potentially form but, as its characteristic cross-peaks have not been observed, it is shown in grey.

Assignment of NN4_tCO6 at neutral pH. Although the quality of these spectra is not sufficient to confirm the topology unambiguously, cross-peaks involving thymine residues of the connecting loop pointed to a head-to-tail folding with the thymine loop located near the major groove, as observed for other sequences of the family. Starting from tCO6, considering stacking connections for the fragment T5-tCO6-C7-G8-T9 and assuming a head-to-tail topology, the rest of the assignment could be carried out. The four cytosines involved in hemiprotonated base pairs could be matched as C7:C15+ (15.40 ppm) and C2:C20+ (15.38 ppm). Accordingly, the fragment C14-C15-G16-T17 could be identified. The observed contacts between tCO6 and a cytosine not involved in hemiprotonated base pairs could be assigned to C19H3'/H5-tCO6H8/H9; they would correspond to major groove contacts between tetrad residues, previously observed in this type of structure. From C19, the fragment C19-C20-G21-T22 could be completed. The remaining fragment, C1-C2-G3-T4, was assigned based on the formation of the C2:C20+ base pair. Only some of the thymine residues of the connecting loop could be tentatively assigned. Imino signals of hemiprotonated base pairs are clearly detected, but characteristic G:C imino-amino cross-peaks are poorly observed. Imino signals of G3, G8 and G16 could be assigned on the basis of stacking cross-peaks with C2, C7 and C15, respectively. Although no cross-peak between the tCO6 H10 proton and a guanine imino proton was observed, the chemical shift of this H10 proton (11.17 ppm) indicates its involvement in hydrogen bond formation.

Assignment of NN4_tCO6 at acidic pH. Under these conditions, the tCO6 residue forms a hemiprotonated base pair that, according to a number of amino-H2'/H2" contacts across the major groove between 3'-3' intercalated pairs (Figure S15), is located in the core of the C:C+ stack. Thus, starting from the tCO6 signals, C14, C1 and C19 could be specifically assigned. Imino-imino cross-peaks between intercalated base pairs also allowed the C7:C15+ pair to be identified. Although C7 and C15 could not be specifically assigned because of the overlap of NOE signals in the non-exchangeable region, the observation of a C7H41-C19H5 cross-peak allowed the identification of the amino protons of C7. The C20 residue was tentatively assigned based on an unusual H5 chemical shift (5.34 ppm) that may indicate, as previously discussed, an interaction across the major groove with the tCO6 aromatic system. Four TH3-GH1 cross-peaks are also observable, suggesting the formation of G:T:G:T tetrads. Guanine imino signals could be specifically assigned on the basis of amino-imino cross-peaks with the stacked cytosines.

Figure S11. Exchangeable and tCO proton regions of the NOESY spectrum (150 ms) of NN4_tCO2 at pH 5, T = 5 °C. 10 mM sodium phosphate buffer (H2O/D2O 90:10), [oligonucleotide] = 0.43 mM. Up to 11 cross-peaks between protonated imino and amino protons are found in the 15 ppm region. Formation of the tCO2:C20+ base pair corresponding to the neutral form is still clearly observed, but at least two new C:C+ base pairs are also formed (blue and green lines). In the aromatic region, two sets of tCO protons, labelled in black (neutral form) and blue (acidic form), are clearly observed.

Figure S12. Exchangeable proton regions of the NOESY spectrum (150 ms) of NN4_tCO2 at pH 4, T = 5 °C. 10 mM sodium phosphate buffer (H2O/D2O 90:10), [oligonucleotide] = 0.43 mM. Signals corresponding to the acidic form are shown in blue. Some signals corresponding to the neutral form are still observed (black labels).

Figure S13. Non-exchangeable proton regions of the NOESY spectrum (150 ms) of NN4_tCO2 at pH 4, T = 5 °C. 10 mM sodium phosphate buffer (H2O/D2O 90:10), [oligonucleotide] = 0.43 mM. Signals corresponding to the acidic form are shown in blue. Some signals corresponding to the neutral form are still observed (black labels).

Figure S17. QM calculations. A) Calculated HOMO and LUMO energy values (eV) obtained with the B3LYP method and the 6-31+G(d) basis set in water (PCM model). tCO+ and C+ represent the protonated states of tCO and C. G:C is the base pair stacked with tCO. G:C:G:C and G:T:G:T are the tetrads from the neutral and acidic states. B) Models used to calculate the stability of the different protonation states of the tCO:C base pair. Calculations were performed at the B3LYP/6-311G(d) DFT level of theory.

Figure S20. Summary of the different equilibria observed in NN4 and its modified analogs, with pHT values obtained by monitoring the CD spectra at 265 and/or 295 nm.
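As an illustration of how a transition pH can be read from a CD titration, the sketch below assumes that pHT denotes the midpoint of the transition and estimates it by linear interpolation between measured points. The fitting procedure actually used is not stated here, so the function `transition_ph`, the use of the first and last points as plateaus, and the synthetic data are all assumptions.

```python
def transition_ph(ph_values, signal):
    """Estimate the pH at which the signal is halfway between its two plateaus.

    Assumes the first and last points sit on the folded and unfolded plateaus
    and that the signal crosses the midpoint between two measured pH values.
    """
    if len(ph_values) != len(signal) or len(ph_values) < 2:
        raise ValueError("need two equally long series with at least two points")
    half = 0.5 * (signal[0] + signal[-1])
    for i in range(len(ph_values) - 1):
        lo, hi = signal[i], signal[i + 1]
        # A sign change of (signal - half) marks the interval with the midpoint
        if lo != hi and (lo - half) * (hi - half) <= 0:
            frac = (half - lo) / (hi - lo)
            return ph_values[i] + frac * (ph_values[i + 1] - ph_values[i])
    raise ValueError("signal never crosses the midpoint")

# Synthetic, made-up ellipticity values (arbitrary units), not data from this work
ph = [5.0, 6.0, 7.0, 8.0]
cd = [1.0, 0.9, 0.1, 0.0]
print(transition_ph(ph, cd))
```

A sigmoidal fit over the whole titration would normally be more robust than this two-point interpolation; the sketch only makes the definition of the midpoint concrete.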

Figure S24. Fluorescence emission of transfected HeLa cells at acidic pH (6.5), compared with non-transfected cells and cells treated with Lipofectamine 2000. Fluorescence quantification values have been normalized with respect to the average of the signal recorded for untreated cells.

Figure S25. Fluorescence emission of transfected HeLa cells at physiological pH (7.4), compared with non-transfected cells and cells treated with Lipofectamine 2000. Fluorescence quantification values have been normalized with respect to the average of the signal recorded for untreated cells.

Figure S26. Fluorescence emission of transfected HeLa cells at alkaline pH (8.5), compared with non-transfected cells and cells treated with Lipofectamine 2000. Fluorescence quantification values have been normalized with respect to the average of the signal recorded for untreated cells.
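The normalization described in the captions of Figures S24 to S26 amounts to dividing each quantified intensity by the mean intensity of the untreated cells. A minimal sketch follows; the function name `normalize_to_untreated` and the numbers are hypothetical and do not come from the measurements.

```python
from statistics import mean

def normalize_to_untreated(values, untreated):
    """Divide each intensity by the average intensity of untreated cells."""
    reference = mean(untreated)
    if reference == 0:
        raise ValueError("untreated reference averages to zero")
    return [v / reference for v in values]

# Made-up per-cell intensities (arbitrary units), for illustration only
untreated_cells = [1.0, 3.0]
transfected_cells = [2.0, 4.0]
print(normalize_to_untreated(transfected_cells, untreated_cells))
```

With this convention, untreated cells average to 1 and the other conditions are read as fold changes relative to them.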

Table S2. Chemical shifts of non-exchangeable protons of NN4_tCO2 at pH 7 and T = 20 °C. n.a.: not assigned; n.o.: not observed.

Table S5. Pseudorotation angle and amplitude values that allow the description of the puckering of the ribose ring in the calculated structure of NN4_tCO2.
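For reference, the pseudorotation phase angle P and the puckering amplitude can be obtained from the five endocyclic sugar torsions ν0 to ν4 with the Altona-Sundaralingam relation, tan P = [(ν4 + ν1) − (ν3 + ν0)] / [2 ν2 (sin 36° + sin 72°)]. The sketch below assumes that convention; the software actually used to produce the table is not stated, and the function name `pseudorotation` and the test torsions are illustrative.

```python
import math

# Constant factor sin 36° + sin 72° from the Altona-Sundaralingam relation
_S = math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))

def pseudorotation(nu0, nu1, nu2, nu3, nu4):
    """Return (phase angle P in degrees within [0, 360), amplitude in degrees).

    Torsions are the endocyclic sugar angles nu0..nu4, given in degrees.
    """
    y = (nu4 + nu1) - (nu3 + nu0)
    x = 2.0 * nu2 * _S
    # atan2 resolves the quadrant that a plain arctangent would leave ambiguous
    phase = math.degrees(math.atan2(y, x)) % 360.0
    # For an ideal ring this equals nu2 / cos(P), but it stays finite when cos(P) is near 0
    amplitude = math.hypot(x, y) / (2.0 * _S)
    return phase, amplitude

# Ideal torsions built from P = 162° (C2'-endo region) and an amplitude of 38°
torsions = [38.0 * math.cos(math.radians(162.0 + 144.0 * (j - 2))) for j in range(5)]
print(pseudorotation(*torsions))
```

Phase angles near 162° correspond to C2'-endo (South) puckers and values near 18° to C3'-endo (North) puckers.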